Abstract. We report a statistical analysis of the soundscape of people agglomerations.
Specifically, we investigate the normalized sound amplitudes and intensities that
emerge from collective gatherings of people. Our findings support the existence of
nontrivial dynamics characterized by heavy-tailed distributions of the sound amplitudes,
long-range correlations in the sound intensity, and non-exponential return interval
distributions. Additionally, motivated by the time-dependent behavior
present in the volatility/variance series, we compare the observational data with those
obtained from a minimalist autoregressive stochastic model, a GARCH process, finding
good agreement.

1. Introduction

Physicists are now addressing problems very far from their traditional domain.
Even social phenomena are now common subjects of research by statistical
physicists [1]. In particular, the general framework of statistical physics has
been successfully applied to diverse interdisciplinary fields ranging from finance [2],
genetics [3] and biology [4] to religion [5], tournaments [6], cuisine [7], and music [8].

Naturally, in social phenomena the basic constituent of the system is the human being.
Humans are known to have nontrivial collective dynamics, much more complicated than
those of idealized interacting physical systems. Moreover, even information about
individual social agents may not be available. This complex scenario is reflected, in some
sense, in several human activities. For instance, elections [9], collaborations between
actors [10] and between scientists [11], phone text-message [12], mail [13, 14] or
email [13, 15] communication, human travel [16, 17], and collective listening habits
[18, 19] are just a few examples where complex structures have been found.

Most previous investigations deal with recorded data obtained directly or
indirectly from the system, trying to extract patterns or regularities in
the system dynamics. This approach has become a trend in the investigation of social
phenomena and of complex systems in general [20, 21, 22, 23, 24]. Within this
framework, many different kinds of data have been used, among them sound. "Listening to"
the system dynamics can be both a simple task and a minimally invasive measurement. In this
direction, several studies focused on sound time series have been carried out. Just to
mention a few: studies of the acoustic emission from crumpled paper [25, 26],
from paper fracture [27], and from fractures in general [28, 29] show several features
related to critical phenomena; the power spectrum of music and speech sounds
presents 1/f-like behavior [30] and the normalized sound amplitude shows non-Gaussian
features [31]; traffic flows were investigated by using the sound noise, revealing scaling
and memory [32]; and avalanche-like dynamics was found in the sound of popping bubbles
in foams [33] and also in lung sounds [34].




In this work, we present an investigation of a very common situation related to
human collective activities: the agglomeration of people. Human agglomerations
can emerge in various places for different reasons, for example, people having lunch
in a restaurant, parties, and work meetings. In all these situations a common and
noticeable feature is perceptible: the sound noise resulting from the agglomeration.
Here, our main goal is to show that nontrivial dynamics emerge when analyzing this
kind of time series. In addition, by employing a minimalist model we are able to reproduce
statistical aspects of the empirical data. In the following, we present the details of
the data acquisition, the statistical analysis of the data, and our minimal model, and
finally we end with a summary.





The soundscape dynamics of human agglomeration 



Figure 1. (Color online) A representative recorded sound signal: (a) the normalized
sound amplitude and (b) the normalized sound intensity, both as a function of time (minutes).

2. Data presentation 

The observational data were obtained by recording the soundscape of people
agglomerating during break time at our university. The meeting point is an open
place where the students spend time until the next class. All the measurements were
made by using a condenser microphone (Shure Microflex MX202W/N) positioned in
the central part of the agglomeration. We employed a sampling rate of 44.1 kHz in
order to cover the full range of human hearing (approximately between 20 Hz and 20
kHz). The measurements were made during different periods on nine days,
totaling 16 recordings. The number of people during the recordings ranged approximately
between 100 and 200, and these variations do not significantly change the statistical
results. Typical recording times are about 10 minutes, and during each recording the
number of people was approximately constant. We also analyzed 10 recordings from a
web sound database, finding similar results when compared with our measurements.
Figure 1(a) shows a representative recorded signal, where we employed the normalized
sound amplitude $A(t)$, i.e., the sound amplitude minus its mean value, divided
by its standard deviation. Figure 1(b) presents the sound intensity, $A^2(t)$, divided by its
standard deviation. In these two figures, we can observe the existence of some bursts
where the sound amplitude and the sound intensity reach values much larger than
their standard deviations. Qualitatively, the origin of these extreme events may be, for
instance, related to the fact that people want to be heard, and if their neighbors are
talking loudly they also have to increase their own sound intensity.

3. Statistical analysis 

One of the most direct ways to characterize the sound amplitude is by evaluating its
probability density function (pdf). We show this analysis in Figure 2(a) for three
typical recordings, where we also plot a Gaussian distribution with zero mean and
unit variance (dashed line). Quite similar behavior was found for all the
other realizations of the experiment and also for the web recordings (at least in the
central part of the distribution). The empirical distribution clearly differs from the
Gaussian one, especially for larger values of the sound amplitude ($|A|$ greater than four
standard deviations). Naturally, this heavy-tailed behavior reflects the presence of the
extreme events that we see qualitatively in Figure 1.
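One simple way to quantify how far a tail departs from the Gaussian is to compare the fraction of samples beyond a given number of standard deviations with the Gaussian expectation. The sketch below is illustrative only; the function names are hypothetical and the short list used at the end is a toy input, not measured data.

```python
import math


def gaussian_tail_probability(k):
    """P(|Z| > k) for a standard Gaussian variable Z."""
    return math.erfc(k / math.sqrt(2.0))


def empirical_tail_fraction(amplitude, k):
    """Fraction of normalized amplitude samples with |A| > k."""
    return sum(1 for a in amplitude if abs(a) > k) / len(amplitude)


# For a standard Gaussian, fewer than one sample in ten thousand lies beyond 4 sigma.
p_gauss = gaussian_tail_probability(4.0)  # about 6.3e-5

# Toy input (assumption): two of the four samples exceed the threshold.
fraction = empirical_tail_fraction([0.0, 5.0, -5.0, 1.0], 4.0)
```

For a heavy-tailed series, `empirical_tail_fraction(A, 4.0)` computed on the normalized amplitude would be expected to exceed `p_gauss` noticeably.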

One possible way to investigate the dynamics of these extreme events is by
evaluating the time intervals between them. These time intervals can be obtained
by considering a threshold value $q$ and storing all the initial times $t_i$ for which the
normalized sound intensity is above this threshold. The difference between two consecutive
times, $\tau_i = t_{i+1} - t_i$, is the so-called return interval. For Gaussian uncorrelated
(or weakly correlated) random variables, the distribution of $\tau_i$ is well known to follow
an exponential distribution $p(\tau) \sim e^{-\tau/\bar{\tau}}$, where $\bar{\tau}$ is the
average value of $\tau_i$ for the threshold value $q$. Additionally, empirical results have
shown that, in the presence of power law correlations in the data, the distribution is well
fitted by a stretched exponential [35, 36, 37] or by a Weibull distribution [38], i.e.,

$$p(\tau) \sim e^{-A(\tau/\bar{\tau})^{\gamma}} \quad \text{or} \quad p(\tau) \sim (\tau/\bar{\tau})^{\gamma-1} e^{-B(\tau/\bar{\tau})^{\gamma}} , \qquad (1)$$

where $A$ and $B$ are constants and $\gamma$ is the exponent of the power law autocorrelation
function. These distributions also emerge in the analytical framework of Santhanam and
Kantz [39] when considering a long-range correlated noise with Gaussian pdf. Notice
that all these distributions depend on $q$, but if we employ the scaled variable
$\tau_i/\bar{\tau}$ this dependence is eliminated.
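The extraction of return intervals can be sketched as follows. This is a minimal illustration, assuming that the "initial time" of an event is the sample at which the series first rises above the threshold; the function names are hypothetical and the series is a toy example.

```python
def return_intervals(series, q):
    """Intervals (in samples) between successive upward crossings of threshold q."""
    starts = [
        t for t in range(len(series))
        if series[t] > q and (t == 0 or series[t - 1] <= q)
    ]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


def scaled_intervals(intervals):
    """Divide each interval by the mean interval, removing the dependence on q."""
    mean = sum(intervals) / len(intervals)
    return [tau / mean for tau in intervals]


# Toy intensity series (assumption: not measured data); events start at t = 1, 5, 7, 11.
toy = [0, 3, 3, 0, 0, 4, 0, 5, 5, 5, 0, 2.5]
taus = return_intervals(toy, q=2)      # [4, 2, 4]
scaled = scaled_intervals(taus)        # unit mean by construction
```

Multiplying the intervals by the sampling time interval converts them to seconds; the scaled variable is unaffected by that choice of units.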

Figure 2. (Color online) (a) Probability distribution of the normalized sound
amplitude $A$ for three realizations of the experiment (squares, circles and triangles)
compared with the standard Gaussian pdf (dashed line) and with the GARCH
model (continuous line). (b) DFA of the normalized sound intensity for the same three
realizations: $\log_{10}[F(n)]$ versus $\log_{10}(n)$ in
comparison with a linear fit (dashed line), for which we found $F(n) \propto n^{h}$ with
$h \approx 0.88$, and with the GARCH model (continuous line). Here $n$ is in units of
1/44.1k seconds. (c) Return interval distribution for one realization of the experiment and
three threshold values, $q = 1$ (squares), $q = 2$ (circles) and $q = 5$ (triangles), compared
with the stretched exponential (dash-dotted line) and the Weibull distribution
(dashed line) of equation (1) with $\gamma = 2(1 - h) = 0.24$, and also with the GARCH model
(continuous line). (d) Volatility distribution for one realization of the experiment,
considering five window sizes: $w = 1$ (squares), $w = 2$ (circles), $w = 5$ (triangles),
$w = 10$ (diamonds) and $w = 100$ (pentagons) hundredths of a second. The dashed line
is a power law $p(v) \propto v^{-4.1}$ and the continuous line is the GARCH prediction.

Before we investigate the return intervals, let us address the question of correlations by
using detrended fluctuation analysis (DFA) [40]. This technique considers
the root mean square fluctuation function $F(n)$ (see for instance [41]) of the integrated
and detrended time series for different values of the time scale $n$. When the data present
scale-invariance properties, $F(n)$ follows a power law $F(n) \sim n^{h}$, where $h$ indicates the
degree of correlation in the time series: if $h = 0.5$ the series is uncorrelated and if $h > 0.5$
the series is long-range correlated. Figure 2(b) shows the fluctuation function versus
$n$ for the same three recordings of Figure 2(a), where we found $h \approx 0.88$, indicating
that long-range correlations are present in the data. Note that the three curves are
practically identical. This is confirmed by evaluating the mean value of $h$ ($\bar{h}$) and
its standard deviation ($\sigma_h$) over the 16 recordings, which gives $\bar{h} = 0.88$ and
$\sigma_h = 0.001$. For the web recordings these values remain close: $\bar{h} = 0.89$ and
$\sigma_h = 0.01$.
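A compact version of the DFA procedure can be sketched in plain Python. The sketch assumes first-order (linear) detrending in non-overlapping boxes, which the text does not specify, and all function names are hypothetical. It is checked on synthetic white noise and on its running sum, for which exponents near 0.5 and 1.5 are expected, rather than on the recordings.

```python
import itertools
import math
import random


def dfa_fluctuation(series, n):
    """RMS fluctuation F(n) of the integrated series, linearly detrended in boxes of size n."""
    mean = sum(series) / len(series)
    profile = list(itertools.accumulate(x - mean for x in series))
    n_boxes = len(profile) // n
    xs = range(n)
    x_mean = (n - 1) / 2.0
    sxx = sum((x - x_mean) ** 2 for x in xs)
    squared_residuals = 0.0
    for b in range(n_boxes):
        box = profile[b * n:(b + 1) * n]
        y_mean = sum(box) / n
        # Least-squares slope of the local linear trend inside this box.
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, box)) / sxx
        squared_residuals += sum(
            (y - y_mean - slope * (x - x_mean)) ** 2 for x, y in zip(xs, box)
        )
    return math.sqrt(squared_residuals / (n_boxes * n))


def dfa_exponent(series, scales):
    """Least-squares slope of log F(n) versus log n, i.e. the exponent h."""
    lx = [math.log(n) for n in scales]
    ly = [math.log(dfa_fluctuation(series, n)) for n in scales]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    return sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum((a - mx) ** 2 for a in lx)


# Synthetic test signals (assumption: not the paper's data).
rng = random.Random(1)
scales = [16, 32, 64, 128, 256]
white = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
walk = list(itertools.accumulate(white))

h_white = dfa_exponent(white, scales)  # expected near 0.5 (uncorrelated)
h_walk = dfa_exponent(walk, scales)    # expected near 1.5 (integrated noise)
```

Applied to the normalized sound intensity, the same two functions would give the slope that the text reports as $h \approx 0.88$; the range of scales used here is arbitrary.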

Now, turning to the return interval distribution, it is worth emphasizing
that the exponents $h$ and $\gamma$ are related via $\gamma = 2(1 - h)$. Moreover, since the
distribution of $\tau_i/\bar{\tau}$ should be normalized and also have unit mean, the only fit
parameter is $\gamma$, which can be obtained from $h$, leading to $\gamma \approx 0.24$.
Figure 2(c) shows this distribution for three values of $q$, where we can observe a reasonable
data collapse but only modest agreement with the distributions of equation (1). A similar
situation has recently been observed when considering non-Gaussian distributions related to
water boiling [42].

We can also investigate the bursts observed in Figure 1(a) by evaluating the
volatility of the normalized sound amplitude. This time series is the local
standard deviation of $A(t)$ estimated over a time window $w = n\Delta t$, i.e.,

$$v_w^2(t) = \frac{1}{n} \sum_{t'=t}^{t+n-1} \left[ A(t') - \langle A(t) \rangle_w \right]^2 , \qquad (2)$$

where $\langle A(t) \rangle_w = \frac{1}{n} \sum_{t'=t}^{t+n-1} A(t')$, $n$ is an integer and
$\Delta t$ is the sampling time interval.

Figure 2(d) shows the volatility distribution of our empirical data for time windows
ranging from 1/100 to 1 second. Notice that we found a good collapse of the data and that
this distribution has an asymptotic power law decay characterized by an exponent $\eta = 4.1$.
The mean value and the standard deviation of $\eta$ calculated over the 16 realizations are,
respectively, $\bar{\eta} = 4.29$ and $\sigma_\eta = 0.35$ ($\bar{\eta} = 4.90$ and
$\sigma_\eta = 1.10$ for the web recordings).
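The local standard deviation over a window of $n$ samples can be sketched as below. The sketch assumes a sliding window starting at every sample and a $1/n$ normalization of the variance; both are assumptions, and the function names are hypothetical.

```python
import math


def window_samples(window_seconds, rate_hz=44_100):
    """Number of samples n in a window of the given duration at the given sampling rate."""
    return round(window_seconds * rate_hz)


def local_volatility(amplitude, n):
    """Local standard deviation of the amplitude over sliding windows of n samples."""
    out = []
    for t in range(len(amplitude) - n + 1):
        window = amplitude[t:t + n]
        mean = sum(window) / n
        out.append(math.sqrt(sum((a - mean) ** 2 for a in window) / n))
    return out


n_short = window_samples(0.01)                 # 441 samples in 1/100 s at 44.1 kHz
v_toy = local_volatility([1, 3, 1, 3], n=2)    # [1.0, 1.0, 1.0]
```

The direct loop costs a number of operations proportional to the series length times $n$; for the longest windows on a full recording a running-sum implementation would be preferable, but the quantity computed is the same.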

4. Modeling 

Our starting point for modeling the data is the non-stationary character of the
volatility. Figure 2(d) supports the conclusion that the volatility of the sound amplitude
is a time-dependent stochastic process, and Figure 2(b) indicates that long-term memory
is present in the sound intensity series. This feature is very common in financial data, where
the volatility (or risk) is one of the most essential ingredients of the price dynamics. In this
context, much work has been done [43] and consequently a large number of models
are available. From a qualitative point of view, the interactions (competition) among
people present in financial markets seem to be similar to those existing in our social
system. This picture motivates us to apply a typical financial model to our data.

One of these models is the generalized autoregressive conditional heteroskedastic
process, or simply the GARCH process. This model was proposed [44] (at least in part)
to take into account the long memory typically found in financial data. It is defined in
its most general form, GARCH($p$, $q$), by

$$\sigma_t^2 = \alpha_0 + \alpha_1 x_{t-1}^2 + \cdots + \alpha_p x_{t-p}^2 + \beta_1 \sigma_{t-1}^2 + \cdots + \beta_q \sigma_{t-q}^2 , \qquad (3)$$

where $\alpha_i$ and $\beta_j$ are positive control parameters and $x_t = \sigma_t \xi_t$, with
$\xi_t$ an uncorrelated random variable with zero mean and unit variance. Thus, the GARCH
process is uncorrelated in $x_t$ but correlated in the variance. Also note that when the
$\beta_j$ terms are absent the GARCH process reduces to the so-called ARCH process [45].

Here, for simplicity and because it proves sufficient, we will focus on the GARCH(1, 1)
process

$$\sigma_t^2 = \alpha_0 + \alpha_1 x_{t-1}^2 + \beta_1 \sigma_{t-1}^2 , \qquad (4)$$

for which we choose the noise $\xi_t = x_t/\sigma_t$ to follow the standard Gaussian
distribution. After this simplification the model has three parameters: $\alpha_0$, $\alpha_1$
and $\beta_1$. However, since the sound amplitude is scaled to unit variance, we can eliminate
one of these parameters by using the expected variance of the GARCH(1, 1) process $x_t$:

$$\sigma_x^2 = \frac{\alpha_0}{1 - \alpha_1 - \beta_1} .$$

In this manner, we are left with two parameters, which we incrementally update to minimize,
via the method of least squares, the difference between the simulated values of the sound
amplitude and the observed ones. The best values for the parameters are $\alpha_1 = 0.011$
and $\beta_1 = 0.9889$, leading to $\alpha_0 = 1 - \alpha_1 - \beta_1 = 0.0001$ since
$\sigma_x = 1$. The comparison with the empirical data is shown in Figure 2, where the
GARCH(1, 1) predictions are indicated by the continuous lines. We can see that the agreement
between the data and the GARCH(1, 1) model is very good.
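A GARCH(1, 1) series with unit stationary variance can be generated with a short loop. The sketch below is an illustration, not the fitting procedure used in the text: the function names and the seed are hypothetical, the noise is standard Gaussian, and $\alpha_0$ is fixed to $1 - \alpha_1 - \beta_1$ so that the expected variance equals one.

```python
import math
import random


def stationary_variance(alpha0, alpha1, beta1):
    """Expected variance of a GARCH(1, 1) process: alpha0 / (1 - alpha1 - beta1)."""
    return alpha0 / (1.0 - alpha1 - beta1)


def simulate_garch11(n_steps, alpha1, beta1, seed=0):
    """Simulate x_t = sigma_t * xi_t with Gaussian xi_t and unit stationary variance."""
    if alpha1 + beta1 >= 1.0:
        raise ValueError("alpha1 + beta1 must be below 1 for a finite variance")
    alpha0 = 1.0 - alpha1 - beta1  # fixes the stationary variance to 1
    rng = random.Random(seed)
    variance = 1.0                 # start from the stationary value
    xs, variances = [], []
    for _ in range(n_steps):
        x = math.sqrt(variance) * rng.gauss(0.0, 1.0)
        xs.append(x)
        variances.append(variance)
        # Conditional variance for the next step, as in the GARCH(1, 1) recursion.
        variance = alpha0 + alpha1 * x * x + beta1 * variance
    return xs, variances


# Parameter values quoted in the text.
x, var = simulate_garch11(100_000, alpha1=0.011, beta1=0.9889)
```

Because $\alpha_1 + \beta_1$ is so close to one, the conditional variance wanders slowly, and the sample variance of a finite run can differ noticeably from the stationary value of one.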

Concerning Figure 2(b), where we compare the DFA results, we have to remark
that the variable $x_t^2$ is not really long-range correlated.
In fact, its autocorrelation function has an exponential decay [43], i.e.,
$\langle x_{t'}^2 x_{t'+t}^2 \rangle \sim \exp(-t/\tau_c)$, where
$\tau_c = |\ln(\alpha_1 + \beta_1)|^{-1}$. However, the GARCH(1, 1) process can mimic the
long-range decay for large values of the characteristic time $\tau_c$. This can be achieved by
choosing the sum $\alpha_1 + \beta_1$ close to unity. In our case, $\alpha_1 + \beta_1 = 0.9999$,
leading to a characteristic time $\tau_c \sim 10^4$ seconds, which is very large, mimicking at
least in part the long-range correlations. Notice that the empirical data also present
deviations from the straight line, suggesting that the correlations present in the data may
have a kind of exponential cutoff.
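The sensitivity of the characteristic time to the sum $\alpha_1 + \beta_1$ can be checked numerically. In this small sketch the function name is hypothetical and the result is expressed in units of the model's time step.

```python
import math


def characteristic_time(alpha1, beta1):
    """Decay time of the GARCH(1, 1) autocorrelation: 1 / |ln(alpha1 + beta1)|."""
    return 1.0 / abs(math.log(alpha1 + beta1))


tau_fit = characteristic_time(0.011, 0.9889)   # sum 0.9999, close to 1e4 steps
tau_short = characteristic_time(0.011, 0.979)  # sum 0.99, close to 1e2 steps
```

Moving the sum from 0.99 to 0.9999 lengthens the memory by about two orders of magnitude, which is why values very close to unity are needed to imitate a slow, power-law-like decay over the scales probed by the DFA.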



5. Summary 

In this work we investigated some statistical aspects of the collective sound emitted by
people when they gather in a meeting place. Empirical evidence showed
that (i) the normalized sound amplitude is not Gaussian distributed, (ii) the sound
intensity presents long-range correlations, (iii) the return interval distribution of the
sound intensity is not exponential, and (iv) the volatility of the sound amplitude is
non-stationary, having a power law tail in its distribution. Motivated by the time dependence
of the volatility, we compared the observed quantities with the predictions of the
GARCH(1, 1) model, finding good agreement with all of them.

Before concluding, we would like to point out some possible mechanisms responsible
for the presence of heavy-tailed distributions and long-term correlations in the data.
The first is that humans already have intrinsically complex
behavior, which may manifest itself in our measurements. Second, these individuals form small
interacting groups, adding more complexity to the system. On a third level, there is also
an emergence of interactions between groups. Naturally, more detailed measurements
and models than those presented here should be considered in order to obtain
a broad understanding of this system.